 boundary attack




Supplementary Material for Biologically Inspired Mechanisms for Adversarial Robustness

Neural Information Processing Systems

Results from these preliminary experiments were not reported in the paper, so we report them here in the supplementary material. The standard bounding boxes provided with the ImageNet dataset were used. Images with no bounding boxes were discarded for this dataset. See Section 2 of the main paper and Bashivan et al. (2019) for full details on the sampling procedure and chosen parameters. Code from Bashivan et al. (2019) was open-sourced at [...] Biological measurements (Gattass et al., 1981, 1988) have demonstrated that in primates, the [...] As described in the paper, we employed two baseline models ('ResNet' and 'coarse fixations') and two [...] The 'ResNet' baseline model feeds the full image directly through a standard ResNet architecture (32x32 for CIFAR10 or 320x320 for ImageNet).


Reviewer

Neural Information Processing Systems

We greatly appreciate that both R1 and R2 consider our paper to be well written and clearly presented. On CIFAR-10, the boundary attack achieves a final average MSE of 0.009 (which [...] FPR = 0.1, which is much higher than what we obtained on white-box attacks (Table 2). [...] FPR setting we used in our experiments. [...] ImageNet dataset, which few works have experimented or succeeded on (including Madry's, which only evaluates on [...] We hope that this aspect of our study can be appreciated. While the two properties are indeed known, our observations and analysis (cf. [...]



OSLO: One-Shot Label-Only Membership Inference Attacks

arXiv.org Artificial Intelligence

We introduce One-Shot Label-Only (OSLO) membership inference attacks (MIAs), which accurately infer a given sample's membership in a target model's training set with high precision using just a single query, where the target model only returns the predicted hard label. This is in contrast to state-of-the-art label-only attacks, which require ~6000 queries yet achieve lower attack precision than OSLO. OSLO leverages transfer-based black-box adversarial attacks. The core idea is that a member sample exhibits more resistance to adversarial perturbations than a non-member. We compare OSLO against state-of-the-art label-only attacks and demonstrate that, despite requiring only one query, our method significantly outperforms previous attacks in terms of precision and true positive rate (TPR) under the same false positive rates (FPR). For example, compared to previous label-only MIAs, OSLO achieves a TPR that is 7× to 28× stronger under a 0.1% FPR on CIFAR10 for a ResNet model. We evaluated multiple defense mechanisms against OSLO.
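The abstract above compares attacks by their TPR at a fixed FPR. As a rough illustration of that metric only (not of OSLO itself), the sketch below computes it from per-sample attack scores. The function name tpr_at_fpr, the convention that a higher score means "more likely a member", and the toy scores are all assumptions made for this example.

```python
import math


def tpr_at_fpr(member_scores, nonmember_scores, target_fpr):
    """Return (tpr, fpr, threshold) at the loosest threshold with FPR <= target_fpr.

    Assumption: a sample is predicted to be a member when score > threshold.
    """
    # Rank non-member scores from highest to lowest.
    ranked = sorted(nonmember_scores, reverse=True)
    # Number of non-members we may misclassify without exceeding the target FPR.
    allowed = min(math.floor(target_fpr * len(ranked)), len(ranked) - 1)
    # Using a strict ">" against this score admits at most `allowed` false positives.
    threshold = ranked[allowed]
    true_pos = sum(s > threshold for s in member_scores)
    false_pos = sum(s > threshold for s in nonmember_scores)
    return true_pos / len(member_scores), false_pos / len(nonmember_scores), threshold


# Toy scores, invented for illustration.
members = [0.9, 0.8, 0.4, 0.2]
nonmembers = [0.7, 0.5, 0.3, 0.1]
tpr, fpr, threshold = tpr_at_fpr(members, nonmembers, target_fpr=0.25)
```

With real attacks the non-member set has to be large for a low target such as 0.1% FPR to be meaningful, since at least 1000 non-members are needed before a single false positive is allowed.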


Adversarial Attacks on Deep Learning Systems for User Identification based on Motion Sensors

arXiv.org Machine Learning

For the time being, mobile devices employ implicit authentication mechanisms, namely, unlock patterns, PINs or biometric-based systems such as fingerprint or face recognition. While these systems are prone to well-known attacks, the introduction of an explicit and unobtrusive authentication layer can greatly enhance security. In this study, we focus on deep learning methods for explicit authentication based on motion sensor signals. In this scenario, attackers could craft adversarial examples with the aim of gaining unauthorized access and even preventing a legitimate user from accessing their mobile device. To our knowledge, this is the first study that aims at quantifying the impact of adversarial attacks on machine learning models used for user identification based on motion sensors. To accomplish our goal, we study multiple methods for generating adversarial examples. We propose three research questions regarding the impact and the universality of adversarial examples, and conduct relevant experiments to answer them. Our empirical results demonstrate that certain adversarial example generation methods are specific to the attacked classification model, while others tend to be generic. We thus conclude that deep neural networks trained for user identification tasks based on motion sensors are subject to a high percentage of misclassification when given adversarial input.


Copy and Paste: A Simple But Effective Initialization Method for Black-Box Adversarial Attacks

arXiv.org Machine Learning

Many optimization methods for generating black-box adversarial examples have been proposed, but the aspect of initializing said optimizers has not been considered in much detail. We show that the choice of starting points is indeed crucial, and that the performance of state-of-the-art attacks depends on it. First, we discuss desirable properties of starting points for attacking image classifiers, and how they can be chosen to increase query efficiency. Notably, we find that simply copying small patches from other images is a valid strategy. In an evaluation on ImageNet, we show that this initialization reduces the number of queries required for a state-of-the-art Boundary Attack by 81%, significantly outperforming previous results reported for targeted black-box adversarial examples.
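The idea of copying small patches from another image to obtain a starting point can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the names paste_patch and patch_initialization, the exhaustive search over positions (which would waste queries in practice), and the toy brightness-based classifier are all hypothetical. Images are plain nested lists to keep the example dependency-free.

```python
def paste_patch(original, donor, top, left, size):
    """Return a copy of `original` with a size x size patch copied from `donor`."""
    patched = [row[:] for row in original]
    for r in range(top, top + size):
        patched[r][left:left + size] = donor[r][left:left + size]
    return patched


def patch_initialization(original, donor, predict_label, target_label):
    """Search for the smallest pasted patch that makes the model output `target_label`.

    `predict_label` is a label-only oracle. Returns (image, patch_size, queries),
    or (None, None, queries) if no patch succeeds.
    """
    height, width = len(original), len(original[0])
    queries = 0
    for size in range(1, min(height, width) + 1):      # smallest patches first
        for top in range(height - size + 1):
            for left in range(width - size + 1):
                candidate = paste_patch(original, donor, top, left, size)
                queries += 1
                if predict_label(candidate) == target_label:
                    return candidate, size, queries
    return None, None, queries


# Toy stand-in classifier (an assumption): label 1 when mean brightness exceeds 0.5.
def toy_predict_label(image):
    pixels = [p for row in image for p in row]
    return 1 if sum(pixels) / len(pixels) > 0.5 else 0


original = [[0.0] * 4 for _ in range(4)]  # classified as 0
donor = [[1.0] * 4 for _ in range(4)]     # classified as 1 (the target class)
start, patch_size, used_queries = patch_initialization(
    original, donor, toy_predict_label, target_label=1
)
```

The returned image would then serve as the starting point for a decision-based attack such as the Boundary Attack, which shrinks the remaining distance to the original image.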


Conditional Generative Models are not Robust

arXiv.org Machine Learning

Class-conditional generative models are an increasingly popular approach to achieve robust classification. They are a natural choice to solve discriminative tasks in a robust manner as they jointly optimize for predictive performance and accurate modeling of the input distribution. In this work, we investigate robust classification with likelihood-based conditional generative models from a theoretical and practical perspective. Our theoretical result reveals that it is impossible to guarantee detectability of adversarial examples even for near-optimal generative classifiers. Experimentally, we show that naively trained conditional generative models have poor discriminative performance, making them unsuitable for classification. This is related to overlooked issues with training conditional generative models and we show methods to improve performance. Finally, we analyze the robustness of our proposed conditional generative models on MNIST and CIFAR10. While we are able to train robust models for MNIST, robustness completely breaks down on CIFAR10. This lack of robustness is related to various undesirable model properties maximum likelihood fails to penalize. Our results indicate that likelihood may fundamentally be at odds with robust classification on challenging problems.


Boundary Attack++: Query-Efficient Decision-Based Adversarial Attack

arXiv.org Machine Learning

Deep neural networks have achieved state-of-the-art performance on a variety of tasks. But they have been shown to be vulnerable to adversarial examples, which are maliciously perturbed examples almost identical to original samples in human perception, but cause models to make incorrect decisions [31]. The vulnerability of neural networks to adversarial examples implies a security risk in applications with real-world consequences, such as self-driving cars, robotics, financial services, and criminal justice, and also suggests a difference between humans and existing machine learning systems. The study of adversarial examples is thus necessary to identify the limitation of current machine learning algorithms, provide a metric for robustness, investigate the potential risk, and suggest ways to improve the robustness of models. Considerable effort has gone into the design of new algorithms for the generation of adversarial examples. Adversarial examples can be categorized according to several criteria: the similarity metric, the attack goal, and the threat model.
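The similarity metric mentioned above is typically a distance between the original and the perturbed input. The excerpt does not say which metrics the paper uses, so the L0, L2, and L-infinity norms and the mean squared error below are an assumption, chosen because they are widely used for this purpose. The function name perturbation_distances is hypothetical.

```python
import math


def perturbation_distances(original, adversarial):
    """Compute common distances between two equally sized, flattened inputs."""
    diffs = [a - o for o, a in zip(original, adversarial)]
    squared = sum(d * d for d in diffs)
    return {
        "l0": sum(1 for d in diffs if d != 0),   # number of changed entries
        "l2": math.sqrt(squared),                # Euclidean distance
        "linf": max(abs(d) for d in diffs),      # largest single change
        "mse": squared / len(diffs),             # mean squared error
    }


# Toy inputs, invented for illustration.
distances = perturbation_distances([0.0, 0.0, 0.0, 0.0], [3.0, 0.0, 4.0, 0.0])
```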